Twenty-One at TREC7: Ad-hoc and Cross-Language Track

Authors

  • Djoerd Hiemstra
  • Wessel Kraaij
Abstract

This paper describes the official runs of the Twenty-One group for TREC-7. The Twenty-One group participated in the ad-hoc and the cross-language track and made the following accomplishments. We developed a new weighting algorithm, which outperforms the popular Cornell version of BM25 on the ad-hoc collection. For the CLIR task we developed a fuzzy matching algorithm to recover from missing translations and spelling variants of proper names. Also for CLIR, we investigated translation strategies that make extensive use of information from our dictionaries: identifying preferred translations, main translations and synonym translations, defining weights for possible translations, and experimenting with probabilistic Boolean matching strategies.
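The abstract does not say how the fuzzy matching works. As a purely illustrative sketch, one common way to recover spelling variants of proper names is to compare character trigram sets. The function names, the Dice coefficient, and the 0.5 threshold below are all assumptions made for this example, not the Twenty-One group's algorithm.

```python
def trigrams(word):
    """Return the set of character trigrams of a padded, lowercased word."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def dice(a, b):
    """Dice coefficient between the trigram sets of two words."""
    ta, tb = trigrams(a), trigrams(b)
    return 2 * len(ta & tb) / (len(ta) + len(tb))


def fuzzy_lookup(term, vocabulary, threshold=0.5):
    """Return index terms at least `threshold` similar to `term`, best first.

    The threshold is a hypothetical value chosen for illustration only.
    """
    scored = [(dice(term, v), v) for v in vocabulary]
    return [v for score, v in sorted(scored, reverse=True) if score >= threshold]


if __name__ == "__main__":
    # A query name with no exact match in a (hypothetical) index vocabulary.
    index_terms = ["gorbachev", "gorbatsjov", "yeltsin"]
    print(fuzzy_lookup("gorbatchev", index_terms))
```

A set-based similarity like this tolerates inserted or substituted letters, which is the kind of variation that transliterated names tend to show across languages. A real system would likely restrict candidates with an n-gram index rather than scanning the whole vocabulary.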

Similar resources

Twenty-One at TREC-7: Ad-hoc and Cross-language Track

This paper describes the official runs of the Twenty-One group for TREC-7. The Twenty-One group participated in the ad-hoc and the cross-language track and made the following accomplishments. We developed a new weighting algorithm, which outperforms the popular Cornell version of BM25 on the ad-hoc collection. For the CLIR task we developed a fuzzy matching algorithm to recover from missing translations...


BBN at TREC7: Using Hidden Markov Models for Information Retrieval

We present a new method for information retrieval using hidden Markov models (HMMs) and relate our experience with this system on the TREC-7 ad hoc task. We develop a general framework for incorporating multiple word generation mechanisms within the same model. We then demonstrate that an extremely simple realization of this model substantially outperforms tf.idf ranking on both the TREC-6 and...


SPIDER Retrieval System at TREC7

This year the Zurich team participated in two tracks: the automatic-adhoc track and the crosslingual track. For the adhoc task we focused on improving retrieval for short queries. We pursued two aims. First, we investigated weighting functions for short queries, explicitly without any kind of automatic query expansion. Second, we developed rules that automatically decide for which queries automa...


Applying Light Natural Language Processing to Ad-Hoc Cross Language Information Retrieval

In the CLEF 2005 Ad-Hoc Track we experimented with language-specific morphosyntactic processing and light Natural Language Processing (NLP) for the retrieval of Bulgarian, French, Italian, English and Greek.


Chinese Document Retrieval at TREC-6

The TREC-6 conference was the fourth year in which document retrieval in a language other than English was carried out. In TREC-3, 4 groups participated in an ad hoc retrieval task on a collection of 208 Mbytes of Mexican newspaper text in the Spanish language. In TREC-4 there were 10 groups who participated, once again in an ad hoc document retrieval task on the same Mexican newspaper texts bu...


Publication date: 1998